Audio-driven Nonlinear Video Diffusion

Authors

  • Anna Llagostera
  • Pierre Vandergheynst
Abstract

In this paper we present a novel nonlinear video diffusion approach based on the fusion of information from the audio and video channels. Both modalities are combined into a diffusion coefficient that encodes the basic assumption of this domain, i.e. that related events in the audio and video channels occur at approximately the same time. The proposed diffusion coefficient thus depends on an estimate of the synchrony between sounds and video motion. As a result, information in video regions whose motion is not coherent with the soundtrack is reduced, and the sound sources are automatically highlighted. Several tests on challenging real-world sequences containing significant auditory and/or visual distractors demonstrate that our approach is able to preserve regions that are related to the soundtrack. In addition, to illustrate the capabilities of our method, we propose an application to the extraction of audio-related video regions by unsupervised segmentation. To the best of our knowledge, this is the first nonlinear video diffusion approach that integrates information from the audio modality.
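
The abstract does not give the form of the diffusion coefficient, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical synchrony estimate (the per-pixel correlation between absolute frame differences and absolute changes in audio energy) and a hypothetical conductivity `g = 1 - synchrony`, so that regions moving in step with the soundtrack diffuse less than the rest. The function names `synchrony_map` and `diffuse` and all parameters are assumptions introduced here.

```python
import numpy as np

def synchrony_map(audio_energy, frames):
    """Hypothetical synchrony estimate in [0, 1] for each pixel.

    audio_energy: shape (T,), one audio energy value per video frame.
    frames: shape (T, H, W), grayscale video.
    """
    motion = np.abs(np.diff(frames, axis=0))      # (T-1, H, W) motion magnitude
    audio = np.abs(np.diff(audio_energy))         # (T-1,) audio change magnitude
    m = motion - motion.mean(axis=0)
    a = audio - audio.mean()
    num = np.tensordot(a, m, axes=(0, 0))         # (H, W) covariance numerator
    den = np.sqrt((a ** 2).sum() * (m ** 2).sum(axis=0)) + 1e-12
    return np.clip(num / den, 0.0, 1.0)           # keep positive correlation only

def diffuse(frames, sync, n_iter=20, dt=0.2):
    """Explicit spatial diffusion with assumed conductivity g = 1 - sync.

    Uses zero-flux borders; dt <= 0.25 keeps the 4-neighbour scheme stable.
    """
    g = 1.0 - sync
    gy = 0.5 * (g[1:, :] + g[:-1, :])             # conductivity between vertical neighbours
    gx = 0.5 * (g[:, 1:] + g[:, :-1])             # conductivity between horizontal neighbours
    u = frames.astype(float).copy()
    for _ in range(n_iter):
        fy = gy * (u[:, 1:, :] - u[:, :-1, :])    # vertical fluxes
        fx = gx * (u[:, :, 1:] - u[:, :, :-1])    # horizontal fluxes
        div = np.zeros_like(u)
        div[:, :-1, :] += fy
        div[:, 1:, :] -= fy
        div[:, :, :-1] += fx
        div[:, :, 1:] -= fx
        u += dt * div
    return u
```

With this choice, pixels whose motion follows the audio keep their detail while unrelated regions are smoothed; a real implementation would also need a more robust synchrony estimator and probably diffusion along the temporal axis.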

Similar resources

Speech-driven facial animation using a hierarchical model - Vision, Image and Signal Processing, IEE Proceedings-

A system capable of producing near video-realistic animation of a speaker given only speech inputs is presented. The audio input is a continuous speech signal, requires no phonetic labelling and is speaker-independent. The system requires only a short video training corpus of a subject speaking a list of viseme-targeted words in order to achieve convincing realistic facial synthesis. The system...

Pragmatic comprehension of apology, request and refusal: An investigation on the effect of consciousness-raising video-driven prompts

Recent research in interlanguage pragmatics (ILP) has substantiated that some aspects of pragmatics are amenable to instruction in the second or foreign language classroom. However, there are still controversies over the most conducive teaching approaches and the required materials. Therefore, this study aims to investigate the relative effectiveness of conscio...

Denoising of Audio Data by Nonlinear Diffusion

Nonlinear diffusion has long proven its capability for discontinuity-preserving removal of noise in image processing. We investigate the possibility to employ diffusion ideas for the denoising of audio signals. An important difference between image and audio signals is which parts of the signal are considered as useful information and noise. While small-scale oscillations in visual images are n...

Audiovisual Attention Modeling and Salient Event Detection

Although human perception appears to be automatic and unconscious, complex sensory mechanisms exist that form the preattentive component of understanding and lead to awareness. Considerable research has been carried out into these preattentive mechanisms and computational models have been developed for similar problems in the fields of computer vision and speech analysis. The focus here is to e...

Publication date: 2011